Dataset statistics
| Number of variables | 25 |
|---|---|
| Number of observations | 32060 |
| Missing cells | 29033 |
| Missing cells (%) | 3.6% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 25.1 MiB |
| Average record size in memory | 822.3 B |
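The derived figures in this table can be recomputed from the raw counts. The sketch below assumes that "Missing cells (%)" is the number of missing cells divided by rows times columns, and that total memory is the average record size times the number of observations; the variable names are illustrative and not taken from the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 25
n_observations = 32060
missing_cells = 29033
avg_record_bytes = 822.3

# Assumed definition: missing cells as a share of all cells (rows x columns).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumed definition: total memory = average record size x number of records.
total_mib = avg_record_bytes * n_observations / 2**20

print(f"Missing cells: {missing_pct:.1f}%")
print(f"Total size in memory: {total_mib:.1f} MiB")
```

Both values round to the figures reported above (3.6% and 25.1 MiB).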
Variable types
| Numeric | 12 |
|---|---|
| Categorical | 13 |
Warnings
| Warning | Type |
|---|---|
| target is highly correlated with created_account | High correlation |
| has_married is highly correlated with marital_status | High correlation |
| created_account is highly correlated with target | High correlation |
| marital_status is highly correlated with has_married | High correlation |
| target has 29033 (90.6%) missing values | Missing |
| capital_gain has 29380 (91.6%) zeros | Zeros |
| capital_loss has 30568 (95.3%) zeros | Zeros |
| total_months_with_employer has 494 (1.5%) zeros | Zeros |
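The percentages in the missing-value and zero warnings are row shares of the 32060 observations. A minimal sketch that recomputes them, assuming one-decimal rounding; the dictionary keys and the `share` helper are hypothetical names used only for illustration:

```python
n_observations = 32060  # "Number of observations"

# (count, reported percentage) pairs copied from the warning rows.
flagged = {
    "target missing": (29033, 90.6),
    "capital_gain zeros": (29380, 91.6),
    "capital_loss zeros": (30568, 95.3),
    "total_months_with_employer zeros": (494, 1.5),
}

def share(count, total=n_observations):
    """Percentage of rows, rounded to one decimal like the report."""
    return round(100 * count / total, 1)

for name, (count, reported) in flagged.items():
    print(f"{name}: {share(count)}% (reported {reported}%)")
```

The count for target (29033) equals the dataset-wide "Missing cells" figure, which suggests that target is the only variable with missing values.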
Reproduction
| Analysis started | 2021-11-28 16:48:32.282302 |
|---|---|
| Analysis finished | 2021-11-28 16:48:59.740743 |
| Duration | 27.46 seconds |
| Software version | pandas-profiling v2.11.0 |
| Download configuration | config.yaml |
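The reported duration is the difference between the two timestamps. A short check using only the standard library (the format string is an assumption about how the timestamps are written):

```python
from datetime import datetime

# Assumed timestamp layout, matching the two rows above.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2021-11-28 16:48:32.282302", FMT)
finished = datetime.strptime("2021-11-28 16:48:59.740743", FMT)

# Elapsed wall-clock time in seconds.
duration = (finished - started).total_seconds()
print(f"{duration:.2f} seconds")
```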
age
Real number (ℝ≥0)
| Distinct | 73 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 38.56481597 |
|---|---|
| Minimum | 17 |
| Maximum | 90 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
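Every numeric variable in this report shows the same memory size, 250.6 KiB. One explanation that is consistent with that figure, though the report does not state it, is 8 bytes per value plus a small fixed index overhead. Both constants below are assumptions:

```python
n_observations = 32060

BYTES_PER_VALUE = 8   # assumption: 64-bit integer or float column
INDEX_BYTES = 128     # assumption: fixed overhead of a default range index

# Column payload plus index overhead, converted to KiB.
memory_kib = (n_observations * BYTES_PER_VALUE + INDEX_BYTES) / 1024
print(f"{memory_kib:.1f} KiB")
```

Without the index term the result would round to 250.5 KiB, so the overhead is needed to match the reported value under these assumptions.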
Quantile statistics
| Minimum | 17 |
|---|---|
| 5-th percentile | 19 |
| Q1 | 28 |
| median | 37 |
| Q3 | 48 |
| 95-th percentile | 63 |
| Maximum | 90 |
| Range | 73 |
| Interquartile range (IQR) | 20 |
Descriptive statistics
| Standard deviation | 13.63753249 |
|---|---|
| Coefficient of variation (CV) | 0.3536262821 |
| Kurtosis | -0.1711930329 |
| Mean | 38.56481597 |
| Median Absolute Deviation (MAD) | 10 |
| Skewness | 0.5582934334 |
| Sum | 1236388 |
| Variance | 185.9822924 |
| Monotonicity | Not monotonic |
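Several of the statistics for age are tied together by simple identities, which makes the tables easy to sanity-check. The sketch below verifies five of them against the reported numbers; the tolerance and the `checks` dictionary are illustrative choices:

```python
import math

# Figures copied from the age tables.
mean = 38.56481597
std = 13.63753249
total = 1236388          # "Sum"
n_observations = 32060
minimum, maximum = 17, 90
q1, q3 = 28, 48

# Each entry pairs a recomputed value with the value the report shows.
checks = {
    "Mean = Sum / n": (total / n_observations, mean),
    "Variance = std ** 2": (std ** 2, 185.9822924),
    "CV = std / mean": (std / mean, 0.3536262821),
    "Range = max - min": (maximum - minimum, 73),
    "IQR = Q3 - Q1": (q3 - q1, 20),
}

for label, (computed, reported) in checks.items():
    ok = math.isclose(computed, reported, rel_tol=1e-6)
    print(f"{label}: computed {computed:.6f}, reported {reported}, match={ok}")
```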
| Value | Count | Frequency (%) |
| 36 | 883 | 2.8% |
| 31 | 873 | 2.7% |
| 34 | 867 | 2.7% |
| 35 | 865 | 2.7% |
| 23 | 864 | 2.7% |
| 33 | 861 | 2.7% |
| 28 | 856 | 2.7% |
| 30 | 848 | 2.6% |
| 37 | 841 | 2.6% |
| 25 | 832 | 2.6% |
| Other values (63) | 23470 |
| Value | Count | Frequency (%) |
| 17 | 393 | |
| 18 | 541 | |
| 19 | 703 | |
| 20 | 745 | |
| 21 | 711 | |
| 22 | 749 | |
| 23 | 864 | |
| 24 | 786 | |
| 25 | 832 | |
| 26 | 771 |
| Value | Count | Frequency (%) |
| 90 | 41 | |
| 88 | 3 | < 0.1% |
| 87 | 1 | < 0.1% |
| 86 | 1 | < 0.1% |
| 85 | 2 | < 0.1% |
| 84 | 10 | < 0.1% |
| 83 | 6 | < 0.1% |
| 82 | 11 | < 0.1% |
| 81 | 19 | |
| 80 | 22 |
marital_status
Categorical
| Distinct | 7 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.2 MiB |
| Married-civ-spouse | |
|---|---|
| Never-married | |
| Divorced | |
| Separated | 1007 |
| Widowed | 976 |
| Other values (2) | 434 |
Length
| Max length | 21 |
|---|---|
| Median length | 13 |
| Mean length | 14.41628197 |
| Min length | 7 |
Characters and Unicode
| Total characters | 462186 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
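The length and character totals for this variable can be rebuilt from its value counts alone, because every row holds one of seven strings. A minimal sketch, with the counts copied from the frequency table for this variable:

```python
# Value counts copied from the frequency table of this variable.
value_counts = {
    "Married-civ-spouse": 14747,
    "Never-married": 10531,
    "Divorced": 4365,
    "Separated": 1007,
    "Widowed": 976,
    "Married-spouse-absent": 411,
    "Married-AF-spouse": 23,
}

n = sum(value_counts.values())

# Each value contributes its length once per occurrence.
total_chars = sum(len(value) * count for value, count in value_counts.items())
mean_length = total_chars / n

lengths = [len(value) for value in value_counts]
distinct_chars = set("".join(value_counts))

print(f"Total characters: {total_chars}")
print(f"Mean length: {mean_length:.8f}")
print(f"Min/max length: {min(lengths)}/{max(lengths)}")
print(f"Distinct characters: {len(distinct_chars)}")
```

The median length is not reproduced here because it depends on the ordering of all rows, not only on the distinct values.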
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Never-married |
|---|---|
| 2nd row | Married-civ-spouse |
| 3rd row | Divorced |
| 4th row | Married-civ-spouse |
| 5th row | Married-civ-spouse |
| Value | Count | Frequency (%) |
| Married-civ-spouse | 14747 | |
| Never-married | 10531 | |
| Divorced | 4365 | 13.6% |
| Separated | 1007 | 3.1% |
| Widowed | 976 | 3.0% |
| Married-spouse-absent | 411 | 1.3% |
| Married-AF-spouse | 23 | 0.1% |
| Value | Count | Frequency (%) |
| married-civ-spouse | 14747 | |
| never-married | 10531 | |
| divorced | 4365 | 13.6% |
| separated | 1007 | 3.1% |
| widowed | 976 | 3.0% |
| married-spouse-absent | 411 | 1.3% |
| married-af-spouse | 23 | 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 69721 | |
| r | 67327 | |
| i | 45800 | |
| - | 40893 | |
| d | 33036 | |
| s | 30773 | 6.7% |
| v | 29643 | 6.4% |
| a | 28137 | 6.1% |
| o | 20522 | 4.4% |
| c | 19112 | 4.1% |
| Other values (14) | 77222 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 389187 | |
| Dash Punctuation | 40893 | 8.8% |
| Uppercase Letter | 32106 | 6.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 69721 | |
| r | 67327 | |
| i | 45800 | |
| d | 33036 | |
| s | 30773 | |
| v | 29643 | |
| a | 28137 | |
| o | 20522 | 5.3% |
| c | 19112 | 4.9% |
| p | 16188 | 4.2% |
| Other values (6) | 28928 |
| Value | Count | Frequency (%) |
| M | 15181 | |
| N | 10531 | |
| D | 4365 | 13.6% |
| S | 1007 | 3.1% |
| W | 976 | 3.0% |
| A | 23 | 0.1% |
| F | 23 | 0.1% |
| Value | Count | Frequency (%) |
| - | 40893 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 421293 | |
| Common | 40893 | 8.8% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 69721 | |
| r | 67327 | |
| i | 45800 | |
| d | 33036 | |
| s | 30773 | |
| v | 29643 | |
| a | 28137 | |
| o | 20522 | 4.9% |
| c | 19112 | 4.5% |
| p | 16188 | 3.8% |
| Other values (13) | 61034 |
| Value | Count | Frequency (%) |
| - | 40893 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 462186 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 69721 | |
| r | 67327 | |
| i | 45800 | |
| - | 40893 | |
| d | 33036 | |
| s | 30773 | 6.7% |
| v | 29643 | 6.4% |
| a | 28137 | 6.1% |
| o | 20522 | 4.4% |
| c | 19112 | 4.1% |
| Other values (14) | 77222 |
education
Categorical
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| HS-grad | |
|---|---|
| Some-college | |
| Bachelors | |
| Masters | |
| Assoc-voc | |
| Other values (11) |
Length
| Max length | 12 |
|---|---|
| Median length | 7 |
| Mean length | 8.435776669 |
| Min length | 3 |
Characters and Unicode
| Total characters | 270451 |
|---|---|
| Distinct characters | 31 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Bachelors |
|---|---|
| 2nd row | Bachelors |
| 3rd row | HS-grad |
| 4th row | 11th |
| 5th row | Bachelors |
| Value | Count | Frequency (%) |
| HS-grad | 10347 | |
| Some-college | 7190 | |
| Bachelors | 5278 | |
| Masters | 1693 | 5.3% |
| Assoc-voc | 1357 | 4.2% |
| 11th | 1159 | 3.6% |
| Assoc-acdm | 1050 | 3.3% |
| 10th | 919 | 2.9% |
| 7th-8th | 634 | 2.0% |
| Prof-school | 567 | 1.8% |
| Other values (6) | 1866 | 5.8% |
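Some frequency cells in the table above are empty in this export. They can be recovered by dividing each count by the number of observations. A sketch under that assumption, with counts copied from the table:

```python
# Counts copied from the education frequency table.
counts = {
    "HS-grad": 10347,
    "Some-college": 7190,
    "Bachelors": 5278,
    "Masters": 1693,
    "Assoc-voc": 1357,
    "11th": 1159,
    "Assoc-acdm": 1050,
    "10th": 919,
    "7th-8th": 634,
    "Prof-school": 567,
    "Other values (6)": 1866,
}

n = sum(counts.values())  # should equal the number of observations

# Frequency as a percentage of all rows, one decimal like the report.
frequency = {value: round(100 * count / n, 1) for value, count in counts.items()}

for value, count in counts.items():
    print(f"| {value} | {count} | {frequency[value]}% |")
```

The recomputed values agree with every percentage the table does show, for example 5.3% for Masters and 5.8% for the remaining six values.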
| Value | Count | Frequency (%) |
| hs-grad | 10347 | |
| some-college | 7190 | |
| bachelors | 5278 | |
| masters | 1693 | 5.3% |
| assoc-voc | 1357 | 4.2% |
| 11th | 1159 | 3.6% |
| assoc-acdm | 1050 | 3.3% |
| 10th | 919 | 2.9% |
| 7th-8th | 634 | 2.0% |
| prof-school | 567 | 1.8% |
| Other values (6) | 1866 | 5.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 28991 | |
| o | 26023 | 9.6% |
| - | 21638 | 8.0% |
| l | 20273 | 7.5% |
| a | 18770 | 6.9% |
| r | 18335 | 6.8% |
| c | 18299 | 6.8% |
| S | 17537 | 6.5% |
| g | 17537 | 6.5% |
| s | 14258 | 5.3% |
| Other values (21) | 68790 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 202782 | |
| Uppercase Letter | 38279 | 14.2% |
| Dash Punctuation | 21638 | 8.0% |
| Decimal Number | 7752 | 2.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 28991 | |
| o | 26023 | |
| l | 20273 | |
| a | 18770 | |
| r | 18335 | |
| c | 18299 | |
| g | 17537 | |
| s | 14258 | |
| d | 11397 | 5.6% |
| h | 10983 | 5.4% |
| Other values (4) | 17916 |
| Value | Count | Frequency (%) |
| 1 | 3821 | |
| 0 | 919 | 11.9% |
| 7 | 634 | 8.2% |
| 8 | 634 | 8.2% |
| 9 | 504 | 6.5% |
| 2 | 419 | 5.4% |
| 5 | 328 | 4.2% |
| 6 | 328 | 4.2% |
| 4 | 165 | 2.1% |
| Value | Count | Frequency (%) |
| S | 17537 | |
| H | 10347 | |
| B | 5278 | 13.8% |
| A | 2407 | 6.3% |
| M | 1693 | 4.4% |
| P | 615 | 1.6% |
| D | 402 | 1.1% |
| Value | Count | Frequency (%) |
| - | 21638 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 241061 | |
| Common | 29390 | 10.9% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 28991 | |
| o | 26023 | |
| l | 20273 | |
| a | 18770 | 7.8% |
| r | 18335 | 7.6% |
| c | 18299 | 7.6% |
| S | 17537 | 7.3% |
| g | 17537 | 7.3% |
| s | 14258 | 5.9% |
| d | 11397 | 4.7% |
| Other values (11) | 49641 |
| Value | Count | Frequency (%) |
| - | 21638 | |
| 1 | 3821 | 13.0% |
| 0 | 919 | 3.1% |
| 7 | 634 | 2.2% |
| 8 | 634 | 2.2% |
| 9 | 504 | 1.7% |
| 2 | 419 | 1.4% |
| 5 | 328 | 1.1% |
| 6 | 328 | 1.1% |
| 4 | 165 | 0.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 270451 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 28991 | |
| o | 26023 | 9.6% |
| - | 21638 | 8.0% |
| l | 20273 | 7.5% |
| a | 18770 | 6.9% |
| r | 18335 | 6.8% |
| c | 18299 | 6.8% |
| S | 17537 | 6.5% |
| g | 17537 | 6.5% |
| s | 14258 | 5.3% |
| Other values (21) | 68790 |
occupation_level
Real number (ℝ≥0)
| Distinct | 20 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 7.757673113 |
|---|---|
| Minimum | 1 |
| Maximum | 20 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 5 |
| median | 8 |
| Q3 | 10 |
| 95-th percentile | 14 |
| Maximum | 20 |
| Range | 19 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 3.859709422 |
|---|---|
| Coefficient of variation (CV) | 0.4975344238 |
| Kurtosis | -0.4598044946 |
| Mean | 7.757673113 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.2736765035 |
| Sum | 248711 |
| Variance | 14.89735682 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8 | 3687 | |
| 6 | 3485 | |
| 10 | 3172 | |
| 4 | 2698 | |
| 7 | 2388 | 7.4% |
| 12 | 2224 | 6.9% |
| 9 | 2136 | 6.7% |
| 5 | 2008 | 6.3% |
| 2 | 1827 | 5.7% |
| 3 | 1777 | 5.5% |
| Other values (10) | 6658 |
| Value | Count | Frequency (%) |
| 1 | 1208 | 3.8% |
| 2 | 1827 | |
| 3 | 1777 | |
| 4 | 2698 | |
| 5 | 2008 | |
| 6 | 3485 | |
| 7 | 2388 | |
| 8 | 3687 | |
| 9 | 2136 | |
| 10 | 3172 |
| Value | Count | Frequency (%) |
| 20 | 31 | 0.1% |
| 19 | 56 | 0.2% |
| 18 | 207 | 0.6% |
| 17 | 179 | 0.6% |
| 16 | 417 | 1.3% |
| 15 | 440 | 1.4% |
| 14 | 1083 | |
| 13 | 1319 | |
| 12 | 2224 | |
| 11 | 1718 |
education_num
Real number (ℝ≥0)
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 13.20761073 |
|---|---|
| Minimum | 1 |
| Maximum | 21 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 6 |
| Q1 | 12 |
| median | 13 |
| Q3 | 16 |
| 95-th percentile | 18 |
| Maximum | 21 |
| Range | 20 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 3.353796886 |
|---|---|
| Coefficient of variation (CV) | 0.2539291137 |
| Kurtosis | 0.7716332024 |
| Mean | 13.20761073 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.364423994 |
| Sum | 423436 |
| Variance | 11.24795355 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 12 | 10347 | |
| 13 | 7190 | |
| 17 | 5278 | |
| 18 | 1693 | 5.3% |
| 14 | 1357 | 4.2% |
| 9 | 1159 | 3.6% |
| 16 | 1050 | 3.3% |
| 8 | 919 | 2.9% |
| 5 | 634 | 2.0% |
| 20 | 567 | 1.8% |
| Other values (6) | 1866 | 5.8% |
| Value | Count | Frequency (%) |
| 1 | 48 | 0.1% |
| 3 | 165 | 0.5% |
| 4 | 328 | 1.0% |
| 5 | 634 | 2.0% |
| 6 | 504 | 1.6% |
| 8 | 919 | 2.9% |
| 9 | 1159 | 3.6% |
| 10 | 419 | 1.3% |
| 12 | 10347 | |
| 13 | 7190 |
| Value | Count | Frequency (%) |
| 21 | 402 | 1.3% |
| 20 | 567 | 1.8% |
| 18 | 1693 | 5.3% |
| 17 | 5278 | |
| 16 | 1050 | 3.3% |
| 14 | 1357 | 4.2% |
| 13 | 7190 | |
| 12 | 10347 | |
| 10 | 419 | 1.3% |
| 9 | 1159 | 3.6% |
familiarity_FB
Real number (ℝ≥0)
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.29033063 |
|---|---|
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 8 |
| 95-th percentile | 9 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 2.673795294 |
|---|---|
| Coefficient of variation (CV) | 0.5054117561 |
| Kurtosis | -1.163045096 |
| Mean | 5.29033063 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.0116450518 |
| Sum | 169608 |
| Variance | 7.149181275 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 3547 | |
| 2 | 3504 | |
| 7 | 3494 | |
| 6 | 3493 | |
| 8 | 3486 | |
| 9 | 3467 | |
| 4 | 3461 | |
| 3 | 3432 | |
| 1 | 2838 | |
| 10 | 1338 | 4.2% |
| Value | Count | Frequency (%) |
| 1 | 2838 | |
| 2 | 3504 | |
| 3 | 3432 | |
| 4 | 3461 | |
| 5 | 3547 | |
| 6 | 3493 | |
| 7 | 3494 | |
| 8 | 3486 | |
| 9 | 3467 | |
| 10 | 1338 | 4.2% |
| Value | Count | Frequency (%) |
| 10 | 1338 | 4.2% |
| 9 | 3467 | |
| 8 | 3486 | |
| 7 | 3494 | |
| 6 | 3493 | |
| 5 | 3547 | |
| 4 | 3461 | |
| 3 | 3432 | |
| 2 | 3504 | |
| 1 | 2838 |
view_FB
Real number (ℝ≥0)
| Distinct | 10 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 5.170929507 |
|---|---|
| Minimum | 1 |
| Maximum | 10 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 3 |
| median | 5 |
| Q3 | 7 |
| 95-th percentile | 9 |
| Maximum | 10 |
| Range | 9 |
| Interquartile range (IQR) | 4 |
Descriptive statistics
| Standard deviation | 2.550474931 |
|---|---|
| Coefficient of variation (CV) | 0.4932333591 |
| Kurtosis | -1.137469448 |
| Mean | 5.170929507 |
| Median Absolute Deviation (MAD) | 2 |
| Skewness | 0.01187618404 |
| Sum | 165780 |
| Variance | 6.504922371 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6 | 3777 | |
| 4 | 3739 | |
| 3 | 3725 | |
| 8 | 3663 | |
| 7 | 3634 | |
| 5 | 3593 | |
| 2 | 3528 | |
| 9 | 3179 | |
| 1 | 2623 | |
| 10 | 599 | 1.9% |
| Value | Count | Frequency (%) |
| 1 | 2623 | |
| 2 | 3528 | |
| 3 | 3725 | |
| 4 | 3739 | |
| 5 | 3593 | |
| 6 | 3777 | |
| 7 | 3634 | |
| 8 | 3663 | |
| 9 | 3179 | |
| 10 | 599 | 1.9% |
| Value | Count | Frequency (%) |
| 10 | 599 | 1.9% |
| 9 | 3179 | |
| 8 | 3663 | |
| 7 | 3634 | |
| 6 | 3777 | |
| 5 | 3593 | |
| 4 | 3739 | |
| 3 | 3725 | |
| 2 | 3528 | |
| 1 | 2623 |
interested_insurance
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.8 MiB |
| 0 | |
|---|---|
| 1 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 32060 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
| Value | Count | Frequency (%) |
| 0 | 18438 | |
| 1 | 13622 |
| Value | Count | Frequency (%) |
| 0 | 18438 | |
| 1 | 13622 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 18438 | |
| 1 | 13622 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 32060 |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 18438 | |
| 1 | 13622 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 32060 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 18438 | |
| 1 | 13622 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 32060 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 18438 | |
| 1 | 13622 |
| Distinct | 3 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| unknown | |
|---|---|
| No | 2787 |
| Yes | 240 |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 6.535402371 |
| Min length | 2 |
Characters and Unicode
| Total characters | 209525 |
|---|---|
| Distinct characters | 9 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | No |
|---|---|
| 2nd row | No |
| 3rd row | No |
| 4th row | No |
| 5th row | No |
| Value | Count | Frequency (%) |
| unknown | 29033 | |
| No | 2787 | 8.7% |
| Yes | 240 | 0.7% |
| Value | Count | Frequency (%) |
| unknown | 29033 | |
| no | 2787 | 8.7% |
| yes | 240 | 0.7% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 87099 | |
| o | 31820 | 15.2% |
| u | 29033 | 13.9% |
| k | 29033 | 13.9% |
| w | 29033 | 13.9% |
| N | 2787 | 1.3% |
| Y | 240 | 0.1% |
| e | 240 | 0.1% |
| s | 240 | 0.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 206498 | |
| Uppercase Letter | 3027 | 1.4% |
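The category counts above correspond to Unicode general categories summed over every character of every row. A minimal sketch that reproduces them from the value counts with the standard `unicodedata` module; the `LABELS` mapping from two-letter category codes to the report's wording is an assumption and covers only the categories that occur here:

```python
import unicodedata
from collections import Counter

# Value counts copied from the frequency table of this variable.
value_counts = {"unknown": 29033, "No": 2787, "Yes": 240}

# Assumed mapping from Unicode category codes to the labels used in the report.
LABELS = {"Ll": "Lowercase Letter", "Lu": "Uppercase Letter"}

category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        # Each character occurs once per row holding this value.
        category_counts[LABELS[unicodedata.category(ch)]] += count

total = sum(category_counts.values())
for label, count in category_counts.most_common():
    print(f"| {label} | {count} | {100 * count / total:.1f}% |")
```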
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 87099 | |
| o | 31820 | 15.4% |
| u | 29033 | 14.1% |
| k | 29033 | 14.1% |
| w | 29033 | 14.1% |
| e | 240 | 0.1% |
| s | 240 | 0.1% |
| Value | Count | Frequency (%) |
| N | 2787 | |
| Y | 240 | 7.9% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 209525 |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 87099 | |
| o | 31820 | 15.2% |
| u | 29033 | 13.9% |
| k | 29033 | 13.9% |
| w | 29033 | 13.9% |
| N | 2787 | 1.3% |
| Y | 240 | 0.1% |
| e | 240 | 0.1% |
| s | 240 | 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 209525 |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 87099 | |
| o | 31820 | 15.2% |
| u | 29033 | 13.9% |
| k | 29033 | 13.9% |
| w | 29033 | 13.9% |
| N | 2787 | 1.3% |
| Y | 240 | 0.1% |
| e | 240 | 0.1% |
| s | 240 | 0.1% |
| Distinct | 2 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.8 MiB |
| 1 | |
|---|---|
| 0 |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 32060 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 0 |
| 4th row | 1 |
| 5th row | 1 |
| Value | Count | Frequency (%) |
| 1 | 17164 | |
| 0 | 14896 |
| Value | Count | Frequency (%) |
| 1 | 17164 | |
| 0 | 14896 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 17164 | |
| 0 | 14896 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 32060 |
Most frequent character per category
| Value | Count | Frequency (%) |
| 1 | 17164 | |
| 0 | 14896 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 32060 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 1 | 17164 | |
| 0 | 14896 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 32060 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 1 | 17164 | |
| 0 | 14896 |
education_order
Real number (ℝ≥0)
| Distinct | 16 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 9.851528384 |
|---|---|
| Minimum | 1 |
| Maximum | 16 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 5 |
| Q1 | 9 |
| median | 10 |
| Q3 | 11 |
| 95-th percentile | 15 |
| Maximum | 16 |
| Range | 15 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 2.411522885 |
|---|---|
| Coefficient of variation (CV) | 0.2447866758 |
| Kurtosis | 1.603895409 |
| Mean | 9.851528384 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -0.1831421012 |
| Sum | 315840 |
| Variance | 5.815442625 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 9 | 10347 | |
| 10 | 7190 | |
| 11 | 5278 | |
| 12 | 1693 | 5.3% |
| 14 | 1357 | 4.2% |
| 7 | 1159 | 3.6% |
| 15 | 1050 | 3.3% |
| 6 | 919 | 2.9% |
| 4 | 634 | 2.0% |
| 16 | 567 | 1.8% |
| Other values (6) | 1866 | 5.8% |
| Value | Count | Frequency (%) |
| 1 | 48 | 0.1% |
| 2 | 165 | 0.5% |
| 3 | 328 | 1.0% |
| 4 | 634 | 2.0% |
| 5 | 504 | 1.6% |
| 6 | 919 | 2.9% |
| 7 | 1159 | 3.6% |
| 8 | 419 | 1.3% |
| 9 | 10347 | |
| 10 | 7190 |
| Value | Count | Frequency (%) |
| 16 | 567 | 1.8% |
| 15 | 1050 | 3.3% |
| 14 | 1357 | 4.2% |
| 13 | 402 | 1.3% |
| 12 | 1693 | 5.3% |
| 11 | 5278 | |
| 10 | 7190 | |
| 9 | 10347 | |
| 8 | 419 | 1.3% |
| 7 | 1159 | 3.6% |
job_title_top_10
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| Other | |
|---|---|
| Accountant, chartered | 484 |
| Engineer, manufacturing | 481 |
| Amenity horticulturist | 462 |
| Tutor | 456 |
| Other values (6) | 2277 |
Length
| Max length | 33 |
|---|---|
| Median length | 5 |
| Mean length | 7.105427324 |
| Min length | 5 |
Characters and Unicode
| Total characters | 227800 |
|---|---|
| Distinct characters | 27 |
| Distinct categories | 4 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Other |
|---|---|
| 2nd row | Other |
| 3rd row | Other |
| 4th row | Other |
| 5th row | Other |
| Value | Count | Frequency (%) |
| Other | 27900 | |
| Accountant, chartered | 484 | 1.5% |
| Engineer, manufacturing | 481 | 1.5% |
| Amenity horticulturist | 462 | 1.4% |
| Tutor | 456 | 1.4% |
| Education officer, community | 443 | 1.4% |
| Environmental health practitioner | 436 | 1.4% |
| Event organiser | 421 | 1.3% |
| Conservator, museum/gallery | 329 | 1.0% |
| Clinical psychologist | 325 | 1.0% |
| Value | Count | Frequency (%) |
| other | 27900 | |
| chartered | 484 | 1.3% |
| accountant | 484 | 1.3% |
| manufacturing | 481 | 1.3% |
| engineer | 481 | 1.3% |
| amenity | 462 | 1.3% |
| horticulturist | 462 | 1.3% |
| tutor | 456 | 1.2% |
| community | 443 | 1.2% |
| officer | 443 | 1.2% |
| Other values (12) | 4547 | 12.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| t | 36165 | |
| r | 34790 | |
| e | 34518 | |
| h | 30366 | |
| O | 27900 | |
| n | 7803 | 3.4% |
| i | 7027 | 3.1% |
| a | 5731 | 2.5% |
| o | 5655 | 2.5% |
| c | 5456 | 2.4% |
| Other values (17) | 32389 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 188768 | |
| Uppercase Letter | 32383 | 14.2% |
| Space Separator | 4583 | 2.0% |
| Other Punctuation | 2066 | 0.9% |
Most frequent character per category
| Value | Count | Frequency (%) |
| t | 36165 | |
| r | 34790 | |
| e | 34518 | |
| h | 30366 | |
| n | 7803 | 4.1% |
| i | 7027 | 3.7% |
| a | 5731 | 3.0% |
| o | 5655 | 3.0% |
| c | 5456 | 2.9% |
| u | 4370 | 2.3% |
| Other values (9) | 16887 |
| Value | Count | Frequency (%) |
| O | 27900 | |
| E | 1781 | 5.5% |
| A | 1269 | 3.9% |
| T | 779 | 2.4% |
| C | 654 | 2.0% |
| Value | Count | Frequency (%) |
| , | 1737 | |
| / | 329 | 15.9% |
| Value | Count | Frequency (%) |
| (space) | 4583 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 221151 | |
| Common | 6649 | 2.9% |
Most frequent character per script
| Value | Count | Frequency (%) |
| t | 36165 | |
| r | 34790 | |
| e | 34518 | |
| h | 30366 | |
| O | 27900 | |
| n | 7803 | 3.5% |
| i | 7027 | 3.2% |
| a | 5731 | 2.6% |
| o | 5655 | 2.6% |
| c | 5456 | 2.5% |
| Other values (14) | 25740 |
| Value | Count | Frequency (%) |
| (space) | 4583 | |
| , | 1737 | 26.1% |
| / | 329 | 4.9% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 227800 |
Most frequent character per block
| Value | Count | Frequency (%) |
| t | 36165 | |
| r | 34790 | |
| e | 34518 | |
| h | 30366 | |
| O | 27900 | |
| n | 7803 | 3.4% |
| i | 7027 | 3.1% |
| a | 5731 | 2.5% |
| o | 5655 | 2.5% |
| c | 5456 | 2.4% |
| Other values (17) | 32389 |
company_email_address
Categorical
| Distinct | 11 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Other | |
|---|---|
| smith.com | 379 |
| jones.com | 309 |
| williams.com | 206 |
| brown.com | 176 |
| Other values (6) | 808 |
Length
| Max length | 12 |
|---|---|
| Median length | 5 |
| Mean length | 5.28147224 |
| Min length | 5 |
Characters and Unicode
| Total characters | 169324 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 3 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | jones.com |
|---|---|
| 2nd row | Other |
| 3rd row | Other |
| 4th row | Other |
| 5th row | Other |
| Value | Count | Frequency (%) |
| Other | 30182 | |
| smith.com | 379 | 1.2% |
| jones.com | 309 | 1.0% |
| williams.com | 206 | 0.6% |
| brown.com | 176 | 0.5% |
| davies.com | 163 | 0.5% |
| taylor.com | 163 | 0.5% |
| evans.com | 137 | 0.4% |
| wilson.com | 122 | 0.4% |
| johnson.com | 112 | 0.3% |
| Value | Count | Frequency (%) |
| other | 30182 | |
| smith.com | 379 | 1.2% |
| jones.com | 309 | 1.0% |
| williams.com | 206 | 0.6% |
| brown.com | 176 | 0.5% |
| davies.com | 163 | 0.5% |
| taylor.com | 163 | 0.5% |
| evans.com | 137 | 0.4% |
| wilson.com | 122 | 0.4% |
| johnson.com | 112 | 0.3% |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 30902 | |
| t | 30835 | |
| r | 30743 | |
| h | 30673 | |
| O | 30182 | |
| o | 2983 | 1.8% |
| m | 2463 | 1.5% |
| . | 1878 | 1.1% |
| c | 1878 | 1.1% |
| s | 1539 | 0.9% |
| Other values (10) | 5248 | 3.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 137264 | |
| Uppercase Letter | 30182 | 17.8% |
| Other Punctuation | 1878 | 1.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| e | 30902 | |
| t | 30835 | |
| r | 30743 | |
| h | 30673 | |
| o | 2983 | 2.2% |
| m | 2463 | 1.8% |
| c | 1878 | 1.4% |
| s | 1539 | 1.1% |
| i | 1076 | 0.8% |
| n | 968 | 0.7% |
| Other values (8) | 3204 | 2.3% |
| Value | Count | Frequency (%) |
| . | 1878 |
| Value | Count | Frequency (%) |
| O | 30182 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 167446 | |
| Common | 1878 | 1.1% |
Most frequent character per script
| Value | Count | Frequency (%) |
| e | 30902 | |
| t | 30835 | |
| r | 30743 | |
| h | 30673 | |
| O | 30182 | |
| o | 2983 | 1.8% |
| m | 2463 | 1.5% |
| c | 1878 | 1.1% |
| s | 1539 | 0.9% |
| i | 1076 | 0.6% |
| Other values (9) | 4172 | 2.5% |
| Value | Count | Frequency (%) |
| . | 1878 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 169324 |
Most frequent character per block
| Value | Count | Frequency (%) |
| e | 30902 | |
| t | 30835 | |
| r | 30743 | |
| h | 30673 | |
| O | 30182 | |
| o | 2983 | 1.8% |
| m | 2463 | 1.5% |
| . | 1878 | 1.1% |
| c | 1878 | 1.1% |
| s | 1539 | 0.9% |
| Other values (10) | 5248 | 3.1% |
hours_per_week
Real number (ℝ≥0)
| Distinct | 94 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 40.4334685 |
|---|---|
| Minimum | 1 |
| Maximum | 99 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 18 |
| Q1 | 40 |
| median | 40 |
| Q3 | 45 |
| 95-th percentile | 60 |
| Maximum | 99 |
| Range | 98 |
| Interquartile range (IQR) | 5 |
Descriptive statistics
| Standard deviation | 12.33371899 |
|---|---|
| Coefficient of variation (CV) | 0.3050373725 |
| Kurtosis | 2.919402385 |
| Mean | 40.4334685 |
| Median Absolute Deviation (MAD) | 3 |
| Skewness | 0.2268609529 |
| Sum | 1296297 |
| Variance | 152.1206241 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 40 | 15004 | |
| 50 | 2779 | 8.7% |
| 45 | 1788 | 5.6% |
| 60 | 1453 | 4.5% |
| 35 | 1277 | 4.0% |
| 20 | 1201 | 3.7% |
| 30 | 1135 | 3.5% |
| 55 | 681 | 2.1% |
| 25 | 663 | 2.1% |
| 48 | 508 | 1.6% |
| Other values (84) | 5571 | 17.4% |
| Value | Count | Frequency (%) |
| 1 | 19 | 0.1% |
| 2 | 32 | 0.1% |
| 3 | 37 | 0.1% |
| 4 | 53 | 0.2% |
| 5 | 58 | 0.2% |
| 6 | 64 | 0.2% |
| 7 | 26 | 0.1% |
| 8 | 142 | |
| 9 | 18 | 0.1% |
| 10 | 272 |
| Value | Count | Frequency (%) |
| 99 | 83 | |
| 98 | 11 | < 0.1% |
| 97 | 2 | < 0.1% |
| 96 | 5 | < 0.1% |
| 95 | 2 | < 0.1% |
| 94 | 1 | < 0.1% |
| 92 | 1 | < 0.1% |
| 91 | 3 | < 0.1% |
| 90 | 29 | 0.1% |
| 89 | 1 | < 0.1% |
capital_gain
Real number (ℝ≥0)
| Distinct | 119 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1076.781815 |
|---|---|
| Minimum | 0 |
| Maximum | 99999 |
| Zeros | 29380 |
| Zeros (%) | 91.6% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 5013 |
| Maximum | 99999 |
| Range | 99999 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 7373.301479 |
|---|---|
| Coefficient of variation (CV) | 6.847535289 |
| Kurtosis | 155.2741732 |
| Mean | 1076.781815 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 11.97046873 |
| Sum | 34521625 |
| Variance | 54365574.7 |
| Monotonicity | Not monotonic |
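With 91.6% zeros and a maximum of 99999, the mean of this variable is driven by a small number of rows. The sketch below recomputes the mean from the sum and then shows how it shifts under two alternative views. Treating 99999 as a placeholder value is a hypothetical reading that the report itself does not state; the count of 156 rows at that value comes from the frequency table for this variable.

```python
# Figures copied from the tables for this variable.
n_observations = 32060
total = 34521625                    # "Sum"
zeros = 29380                       # "Zeros"
top_value, top_count = 99999, 156   # maximum and its row count

# Mean over all rows, as reported.
mean = total / n_observations

# Mean over rows with a non-zero value only.
nonzero_rows = n_observations - zeros
mean_nonzero = total / nonzero_rows

# Hypothetical: drop the rows at 99999 as if they were placeholders.
mean_without_top = (total - top_value * top_count) / (n_observations - top_count)

print(f"Mean (all rows): {mean:.4f}")
print(f"Non-zero rows: {nonzero_rows}")
print(f"Mean (non-zero rows): {mean_nonzero:.1f}")
print(f"Mean (without {top_value}): {mean_without_top:.1f}")
```

The gap between these figures is one reason the median and quantiles are usually more informative than the mean for a column like this.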
| Value | Count | Frequency (%) |
| 0 | 29380 | |
| 15024 | 342 | 1.1% |
| 7688 | 282 | 0.9% |
| 7298 | 241 | 0.8% |
| 99999 | 156 | 0.5% |
| 3103 | 96 | 0.3% |
| 5178 | 96 | 0.3% |
| 4386 | 70 | 0.2% |
| 5013 | 69 | 0.2% |
| 8614 | 55 | 0.2% |
| Other values (109) | 1273 | 4.0% |
| Value | Count | Frequency (%) |
| 0 | 29380 | |
| 114 | 6 | < 0.1% |
| 401 | 2 | < 0.1% |
| 594 | 34 | 0.1% |
| 914 | 8 | < 0.1% |
| 991 | 5 | < 0.1% |
| 1055 | 24 | 0.1% |
| 1086 | 3 | < 0.1% |
| 1111 | 1 | < 0.1% |
| 1151 | 8 | < 0.1% |
| Value | Count | Frequency (%) |
| 99999 | 156 | |
| 41310 | 2 | < 0.1% |
| 34095 | 5 | < 0.1% |
| 27828 | 33 | 0.1% |
| 25236 | 11 | < 0.1% |
| 25124 | 4 | < 0.1% |
| 22040 | 1 | < 0.1% |
| 20051 | 36 | 0.1% |
| 18481 | 2 | < 0.1% |
| 15831 | 5 | < 0.1% |
capital_loss
Real number (ℝ≥0)
| Distinct | 91 |
|---|---|
| Distinct (%) | 0.3% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 87.09937617 |
|---|---|
| Minimum | 0 |
| Maximum | 4356 |
| Zeros | 30568 |
| Zeros (%) | 95.3% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 0 |
| Maximum | 4356 |
| Range | 4356 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 402.5526524 |
|---|---|
| Coefficient of variation (CV) | 4.621762751 |
| Kurtosis | 20.46029417 |
| Mean | 87.09937617 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 4.602228901 |
| Sum | 2792406 |
| Variance | 162048.638 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 30568 | |
| 1902 | 198 | 0.6% |
| 1977 | 164 | 0.5% |
| 1887 | 155 | 0.5% |
| 1848 | 51 | 0.2% |
| 1485 | 50 | 0.2% |
| 2415 | 48 | 0.1% |
| 1602 | 45 | 0.1% |
| 1740 | 42 | 0.1% |
| 1590 | 40 | 0.1% |
| Other values (81) | 699 | 2.2% |
| Value | Count | Frequency (%) |
| 0 | 30568 | |
| 155 | 1 | < 0.1% |
| 213 | 4 | < 0.1% |
| 323 | 3 | < 0.1% |
| 419 | 3 | < 0.1% |
| 625 | 12 | < 0.1% |
| 653 | 3 | < 0.1% |
| 810 | 2 | < 0.1% |
| 880 | 5 | < 0.1% |
| 974 | 2 | < 0.1% |
| Value | Count | Frequency (%) |
| 4356 | 3 | < 0.1% |
| 3900 | 2 | < 0.1% |
| 3770 | 2 | < 0.1% |
| 3683 | 2 | < 0.1% |
| 3004 | 2 | < 0.1% |
| 2824 | 10 | |
| 2754 | 2 | < 0.1% |
| 2603 | 5 | |
| 2559 | 11 | |
| 2547 | 4 | < 0.1% |
native_country
Categorical
| Distinct | 40 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.1 MiB |
| United Kingdom | 28726 |
|---|---|
| Scotland | 646 |
| ? | 571 |
| Poland | 253 |
| Germany | 132 |
| Other values (35) | 1732 |
Length
| Max length | 26 |
|---|---|
| Median length | 14 |
| Mean length | 13.16777916 |
| Min length | 1 |
Characters and Unicode
| Total characters | 422159 |
|---|---|
| Distinct characters | 46 |
| Distinct categories | 7 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 1 ? |
|---|---|
| Unique (%) | < 0.1% |
Sample
| 1st row | United Kingdom |
|---|---|
| 2nd row | United Kingdom |
| 3rd row | United Kingdom |
| 4th row | United Kingdom |
| 5th row | Sweden |
| Value | Count | Frequency (%) |
| United Kingdom | 28726 | 89.6% |
| Scotland | 646 | 2.0% |
| ? | 571 | 1.8% |
| Poland | 253 | 0.8% |
| Germany | 132 | 0.4% |
| Canada | 119 | 0.4% |
| Bulgaria | 114 | 0.4% |
| Wales | 105 | 0.3% |
| India | 100 | 0.3% |
| Sweden | 95 | 0.3% |
| Other values (30) | 1199 | 3.7% |
| Value | Count | Frequency (%) |
| kingdom | 28726 | |
| united | 28726 | |
| scotland | 646 | 1.1% |
| 571 | 0.9% | |
| poland | 253 | 0.4% |
| germany | 132 | 0.2% |
| canada | 119 | 0.2% |
| bulgaria | 114 | 0.2% |
| wales | 105 | 0.2% |
| india | 100 | 0.2% |
| Other values (31) | 1294 | 2.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 59564 | |
| d | 58974 | |
| i | 58267 | |
| o | 29935 | |
| t | 29759 | |
| e | 29628 | |
| m | 29225 | |
| g | 29078 | |
| U | 28754 | |
| 28726 | ||
| Other values (36) | 40249 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 332439 | |
| Uppercase Letter | 60333 | 14.3% |
| Space Separator | 28726 | 6.8% |
| Other Punctuation | 590 | 0.1% |
| Dash Punctuation | 43 | < 0.1% |
| Open Punctuation | 14 | < 0.1% |
| Close Punctuation | 14 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| U | 28754 | |
| K | 28726 | |
| S | 769 | 1.3% |
| P | 319 | 0.5% |
| C | 268 | 0.4% |
| I | 250 | 0.4% |
| G | 238 | 0.4% |
| J | 139 | 0.2% |
| E | 118 | 0.2% |
| N | 114 | 0.2% |
| Other values (10) | 638 | 1.1% |
| Value | Count | Frequency (%) |
| n | 59564 | |
| d | 58974 | |
| i | 58267 | |
| o | 29935 | |
| t | 29759 | |
| e | 29628 | |
| m | 29225 | |
| g | 29078 | |
| a | 3562 | 1.1% |
| l | 1587 | 0.5% |
| Other values (10) | 2860 | 0.9% |
| Value | Count | Frequency (%) |
| ? | 571 | |
| & | 19 | 3.2% |
| Value | Count | Frequency (%) |
| 28726 |
| Value | Count | Frequency (%) |
| - | 43 |
| Value | Count | Frequency (%) |
| ( | 14 |
| Value | Count | Frequency (%) |
| ) | 14 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 392772 | |
| Common | 29387 | 7.0% |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 59564 | |
| d | 58974 | |
| i | 58267 | |
| o | 29935 | |
| t | 29759 | |
| e | 29628 | |
| m | 29225 | |
| g | 29078 | |
| U | 28754 | |
| K | 28726 | |
| Other values (30) | 10862 | 2.8% |
| Value | Count | Frequency (%) |
| 28726 | ||
| ? | 571 | 1.9% |
| - | 43 | 0.1% |
| & | 19 | 0.1% |
| ( | 14 | < 0.1% |
| ) | 14 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 422159 |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 59564 | |
| d | 58974 | |
| i | 58267 | |
| o | 29935 | |
| t | 29759 | |
| e | 29628 | |
| m | 29225 | |
| g | 29078 | |
| U | 28754 | |
| 28726 | ||
| Other values (36) | 40249 |
demographic_characteristic
Real number (ℝ≥0)
| Distinct | 21423 |
|---|---|
| Distinct (%) | 66.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 189843.6625 |
|---|---|
| Minimum | 12285 |
| Maximum | 1484705 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 12285 |
|---|---|
| 5-th percentile | 39409.85 |
| Q1 | 117789 |
| median | 178449 |
| Q3 | 237065 |
| 95-th percentile | 379778.05 |
| Maximum | 1484705 |
| Range | 1472420 |
| Interquartile range (IQR) | 119276 |
Descriptive statistics
| Standard deviation | 105680.6841 |
|---|---|
| Coefficient of variation (CV) | 0.5566721729 |
| Kurtosis | 6.262694948 |
| Mean | 189843.6625 |
| Median Absolute Deviation (MAD) | 59948.5 |
| Skewness | 1.452524423 |
| Sum | 6086387820 |
| Variance | 1.1168407 × 10^10 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 203488 | 13 | < 0.1% |
| 123011 | 13 | < 0.1% |
| 164190 | 12 | < 0.1% |
| 148995 | 12 | < 0.1% |
| 113364 | 12 | < 0.1% |
| 123983 | 11 | < 0.1% |
| 111483 | 11 | < 0.1% |
| 155659 | 11 | < 0.1% |
| 120131 | 11 | < 0.1% |
| 126569 | 11 | < 0.1% |
| Other values (21413) | 31943 |
| Value | Count | Frequency (%) |
| 12285 | 1 | < 0.1% |
| 13769 | 1 | < 0.1% |
| 14878 | 1 | < 0.1% |
| 18827 | 1 | < 0.1% |
| 19214 | 1 | < 0.1% |
| 19302 | 5 | < 0.1% |
| 19395 | 2 | < 0.1% |
| 19410 | 1 | < 0.1% |
| 19491 | 1 | < 0.1% |
| 19520 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 1484705 | 1 | < 0.1% |
| 1455435 | 1 | < 0.1% |
| 1366120 | 1 | < 0.1% |
| 1268339 | 1 | < 0.1% |
| 1226583 | 1 | < 0.1% |
| 1184622 | 1 | < 0.1% |
| 1161363 | 1 | < 0.1% |
| 1125613 | 1 | < 0.1% |
| 1097453 | 1 | < 0.1% |
| 1085515 | 1 | < 0.1% |
town_adj
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| Edinburgh | 19228 |
|---|---|
| Swindon | 5170 |
| Other | 3735 |
| Leeds | 1510 |
| Oxford | 1393 |
Length
| Max length | 9 |
|---|---|
| Median length | 9 |
| Mean length | 7.828852152 |
| Min length | 5 |
Characters and Unicode
| Total characters | 250993 |
|---|---|
| Distinct characters | 21 |
| Distinct categories | 2 ? |
| Distinct scripts | 1 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Edinburgh |
|---|---|
| 2nd row | Leeds |
| 3rd row | Edinburgh |
| 4th row | Edinburgh |
| 5th row | Swindon |
| Value | Count | Frequency (%) |
| Edinburgh | 19228 | 60.0% |
| Swindon | 5170 | 16.1% |
| Other | 3735 | 11.7% |
| Leeds | 1510 | 4.7% |
| Oxford | 1393 | 4.3% |
| Bristol | 1024 | 3.2% |
| Value | Count | Frequency (%) |
| edinburgh | 19228 | |
| swindon | 5170 | 16.1% |
| other | 3735 | 11.7% |
| leeds | 1510 | 4.7% |
| oxford | 1393 | 4.3% |
| bristol | 1024 | 3.2% |
Most occurring characters
| Value | Count | Frequency (%) |
| n | 29568 | |
| d | 27301 | |
| i | 25422 | |
| r | 25380 | |
| h | 22963 | |
| E | 19228 | |
| b | 19228 | |
| u | 19228 | |
| g | 19228 | |
| o | 7587 | 3.0% |
| Other values (11) | 35860 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 218933 | |
| Uppercase Letter | 32060 | 12.8% |
Most frequent character per category
| Value | Count | Frequency (%) |
| n | 29568 | |
| d | 27301 | |
| i | 25422 | |
| r | 25380 | |
| h | 22963 | |
| b | 19228 | |
| u | 19228 | |
| g | 19228 | |
| o | 7587 | 3.5% |
| e | 6755 | 3.1% |
| Other values (6) | 16273 |
| Value | Count | Frequency (%) |
| E | 19228 | |
| S | 5170 | 16.1% |
| O | 5128 | 16.0% |
| L | 1510 | 4.7% |
| B | 1024 | 3.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 250993 |
Most frequent character per script
| Value | Count | Frequency (%) |
| n | 29568 | |
| d | 27301 | |
| i | 25422 | |
| r | 25380 | |
| h | 22963 | |
| E | 19228 | |
| b | 19228 | |
| u | 19228 | |
| g | 19228 | |
| o | 7587 | 3.0% |
| Other values (11) | 35860 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 250993 |
Most frequent character per block
| Value | Count | Frequency (%) |
| n | 29568 | |
| d | 27301 | |
| i | 25422 | |
| r | 25380 | |
| h | 22963 | |
| E | 19228 | |
| b | 19228 | |
| u | 19228 | |
| g | 19228 | |
| o | 7587 | 3.0% |
| Other values (11) | 35860 |
paye_adj
Categorical
| Distinct | 4 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 1.9 MiB |
| Other | 26591 |
|---|---|
| NW384000 | 3256 |
| BR442000 | 1414 |
| EE913000 | 799 |
Length
| Max length | 8 |
|---|---|
| Median length | 5 |
| Mean length | 5.511759201 |
| Min length | 5 |
Characters and Unicode
| Total characters | 176707 |
|---|---|
| Distinct characters | 17 |
| Distinct categories | 3 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | Other |
|---|---|
| 2nd row | Other |
| 3rd row | Other |
| 4th row | Other |
| 5th row | BR442000 |
| Value | Count | Frequency (%) |
| Other | 26591 | 82.9% |
| NW384000 | 3256 | 10.2% |
| BR442000 | 1414 | 4.4% |
| EE913000 | 799 | 2.5% |
| Value | Count | Frequency (%) |
| other | 26591 | |
| nw384000 | 3256 | 10.2% |
| br442000 | 1414 | 4.4% |
| ee913000 | 799 | 2.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| O | 26591 | |
| t | 26591 | |
| h | 26591 | |
| e | 26591 | |
| r | 26591 | |
| 0 | 16407 | |
| 4 | 6084 | 3.4% |
| 3 | 4055 | 2.3% |
| N | 3256 | 1.8% |
| W | 3256 | 1.8% |
| Other values (7) | 10694 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 106364 | |
| Uppercase Letter | 37529 | 21.2% |
| Decimal Number | 32814 | 18.6% |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 16407 | |
| 4 | 6084 | 18.5% |
| 3 | 4055 | 12.4% |
| 8 | 3256 | 9.9% |
| 2 | 1414 | 4.3% |
| 9 | 799 | 2.4% |
| 1 | 799 | 2.4% |
| Value | Count | Frequency (%) |
| O | 26591 | |
| N | 3256 | 8.7% |
| W | 3256 | 8.7% |
| E | 1598 | 4.3% |
| B | 1414 | 3.8% |
| R | 1414 | 3.8% |
| Value | Count | Frequency (%) |
| t | 26591 | |
| h | 26591 | |
| e | 26591 | |
| r | 26591 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 143893 | |
| Common | 32814 | 18.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| O | 26591 | |
| t | 26591 | |
| h | 26591 | |
| e | 26591 | |
| r | 26591 | |
| N | 3256 | 2.3% |
| W | 3256 | 2.3% |
| E | 1598 | 1.1% |
| B | 1414 | 1.0% |
| R | 1414 | 1.0% |
| Value | Count | Frequency (%) |
| 0 | 16407 | |
| 4 | 6084 | 18.5% |
| 3 | 4055 | 12.4% |
| 8 | 3256 | 9.9% |
| 2 | 1414 | 4.3% |
| 9 | 799 | 2.4% |
| 1 | 799 | 2.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 176707 |
Most frequent character per block
| Value | Count | Frequency (%) |
| O | 26591 | |
| t | 26591 | |
| h | 26591 | |
| e | 26591 | |
| r | 26591 | |
| 0 | 16407 | |
| 4 | 6084 | 3.4% |
| 3 | 4055 | 2.3% |
| N | 3256 | 1.8% |
| W | 3256 | 1.8% |
| Other values (7) | 10694 |
annual_salary
Real number (ℝ≥0)
| Distinct | 12677 |
|---|---|
| Distinct (%) | 39.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 200506481.3 |
|---|---|
| Minimum | 36 |
| Maximum | 6655018723 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 36 |
|---|---|
| 5-th percentile | 15404.925 |
| Q1 | 19212 |
| median | 24062 |
| Q3 | 36843 |
| 95-th percentile | 82273854 |
| Maximum | 6655018723 |
| Range | 6655018687 |
| Interquartile range (IQR) | 17631 |
Descriptive statistics
| Standard deviation | 1086622660 |
|---|---|
| Coefficient of variation (CV) | 5.419389203 |
| Kurtosis | 29.75904943 |
| Mean | 200506481.3 |
| Median Absolute Deviation (MAD) | 6277 |
| Skewness | 5.587776499 |
| Sum | 6.428237789 × 10^12 |
| Variance | 1.180748805 × 10^18 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 6655018723 | 799 | 2.5% |
| 82273854 | 99 | 0.3% |
| 18252 | 37 | 0.1% |
| 19032 | 34 | 0.1% |
| 17524 | 34 | 0.1% |
| 19500 | 33 | 0.1% |
| 20124 | 32 | 0.1% |
| 19552 | 32 | 0.1% |
| 19812 | 32 | 0.1% |
| 18824 | 29 | 0.1% |
| Other values (12667) | 30899 |
| Value | Count | Frequency (%) |
| 36 | 1 | < 0.1% |
| 203 | 1 | < 0.1% |
| 260 | 1 | < 0.1% |
| 286 | 1 | < 0.1% |
| 293 | 1 | < 0.1% |
| 362 | 1 | < 0.1% |
| 453 | 1 | < 0.1% |
| 470 | 1 | < 0.1% |
| 513 | 1 | < 0.1% |
| 518 | 1 | < 0.1% |
| Value | Count | Frequency (%) |
| 6655018723 | 799 | 2.5% |
| 6631218918 | 1 | < 0.1% |
| 6627807440 | 1 | < 0.1% |
| 6623537895 | 1 | < 0.1% |
| 6605416087 | 1 | < 0.1% |
| 6595418706 | 1 | < 0.1% |
| 6572530984 | 1 | < 0.1% |
| 6535381743 | 1 | < 0.1% |
| 6526726426 | 1 | < 0.1% |
| 6522356558 | 1 | < 0.1% |
salary_band_text_adj
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 3.0 MiB |
| £ yearly | 12816 |
|---|---|
| £. per month | 6380 |
| £ - range | 4975 |
| £. pw | 4697 |
| foreign_ccy | 3162 |
Length
| Max length | 12 |
|---|---|
| Median length | 8 |
| Mean length | 8.958983157 |
| Min length | 4 |
Characters and Unicode
| Total characters | 287225 |
|---|---|
| Distinct characters | 24 |
| Distinct categories | 7 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 2 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | £ yearly |
|---|---|
| 2nd row | £ yearly |
| 3rd row | £. pw |
| 4th row | £ yearly |
| 5th row | £. per month |
| Value | Count | Frequency (%) |
| £ yearly | 12816 | 40.0% |
| £. per month | 6380 | 19.9% |
| £ - range | 4975 | 15.5% |
| £. pw | 4697 | 14.7% |
| foreign_ccy | 3162 | 9.9% |
| .BSD | 30 | 0.1% |
| Value | Count | Frequency (%) |
| £ | 28868 | |
| yearly | 12816 | |
| month | 6380 | 8.8% |
| per | 6380 | 8.8% |
| 4975 | 6.9% | |
| range | 4975 | 6.9% |
| pw | 4697 | 6.5% |
| foreign_ccy | 3162 | 4.4% |
| bsd | 30 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| 45198 | ||
| £ | 28868 | |
| y | 28794 | |
| e | 27333 | |
| r | 27333 | |
| a | 17791 | 6.2% |
| n | 14517 | 5.1% |
| l | 12816 | 4.5% |
| . | 11107 | 3.9% |
| p | 11077 | 3.9% |
| Other values (14) | 62391 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 193825 | |
| Space Separator | 45198 | 15.7% |
| Currency Symbol | 28868 | 10.1% |
| Other Punctuation | 11107 | 3.9% |
| Dash Punctuation | 4975 | 1.7% |
| Connector Punctuation | 3162 | 1.1% |
| Uppercase Letter | 90 | < 0.1% |
Most frequent character per category
| Value | Count | Frequency (%) |
| y | 28794 | |
| e | 27333 | |
| r | 27333 | |
| a | 17791 | |
| n | 14517 | |
| l | 12816 | |
| p | 11077 | 5.7% |
| o | 9542 | 4.9% |
| g | 8137 | 4.2% |
| m | 6380 | 3.3% |
| Other values (6) | 30105 |
| Value | Count | Frequency (%) |
| B | 30 | |
| S | 30 | |
| D | 30 |
| Value | Count | Frequency (%) |
| £ | 28868 |
| Value | Count | Frequency (%) |
| 45198 |
| Value | Count | Frequency (%) |
| . | 11107 |
| Value | Count | Frequency (%) |
| - | 4975 |
| Value | Count | Frequency (%) |
| _ | 3162 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 193915 | |
| Common | 93310 |
Most frequent character per script
| Value | Count | Frequency (%) |
| y | 28794 | |
| e | 27333 | |
| r | 27333 | |
| a | 17791 | |
| n | 14517 | |
| l | 12816 | |
| p | 11077 | 5.7% |
| o | 9542 | 4.9% |
| g | 8137 | 4.2% |
| m | 6380 | 3.3% |
| Other values (9) | 30195 |
| Value | Count | Frequency (%) |
| 45198 | ||
| £ | 28868 | |
| . | 11107 | 11.9% |
| - | 4975 | 5.3% |
| _ | 3162 | 3.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 258357 | |
| None | 28868 | 10.1% |
Most frequent character per block
| Value | Count | Frequency (%) |
| £ | 28868 |
| Value | Count | Frequency (%) |
| 45198 | ||
| y | 28794 | |
| e | 27333 | |
| r | 27333 | |
| a | 17791 | 6.9% |
| n | 14517 | 5.6% |
| l | 12816 | 5.0% |
| . | 11107 | 4.3% |
| p | 11077 | 4.3% |
| o | 9542 | 3.7% |
| Other values (13) | 52849 |
total_months_with_employer
Real number (ℝ≥0)
| Distinct | 487 |
|---|---|
| Distinct (%) | 1.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 67.49435434 |
|---|---|
| Minimum | 0 |
| Maximum | 695 |
| Zeros | 494 |
| Zeros (%) | 1.5% |
| Memory size | 250.6 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3 |
| Q1 | 15 |
| median | 38 |
| Q3 | 92 |
| 95-th percentile | 230 |
| Maximum | 695 |
| Range | 695 |
| Interquartile range (IQR) | 77 |
Descriptive statistics
| Standard deviation | 77.30166006 |
|---|---|
| Coefficient of variation (CV) | 1.145305571 |
| Kurtosis | 5.460396295 |
| Mean | 67.49435434 |
| Median Absolute Deviation (MAD) | 29 |
| Skewness | 2.089231048 |
| Sum | 2163869 |
| Variance | 5975.546648 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 543 | 1.7% |
| 11 | 536 | 1.7% |
| 3 | 533 | 1.7% |
| 2 | 523 | 1.6% |
| 10 | 521 | 1.6% |
| 9 | 518 | 1.6% |
| 8 | 513 | 1.6% |
| 7 | 510 | 1.6% |
| 22 | 510 | 1.6% |
| 6 | 502 | 1.6% |
| Other values (477) | 26851 |
| Value | Count | Frequency (%) |
| 0 | 494 | 1.5% |
| 1 | 543 | 1.7% |
| 2 | 523 | 1.6% |
| 3 | 533 | 1.7% |
| 4 | 486 | 1.5% |
| 5 | 489 | 1.5% |
| 6 | 502 | 1.6% |
| 7 | 510 | 1.6% |
| 8 | 513 | 1.6% |
| 9 | 518 | 1.6% |
| Value | Count | Frequency (%) |
| 695 | 1 | < 0.1% |
| 691 | 1 | < 0.1% |
| 644 | 1 | < 0.1% |
| 637 | 1 | < 0.1% |
| 618 | 1 | < 0.1% |
| 617 | 1 | < 0.1% |
| 612 | 1 | < 0.1% |
| 589 | 1 | < 0.1% |
| 582 | 1 | < 0.1% |
| 564 | 1 | < 0.1% |
workclass_adj
Categorical
| Distinct | 6 |
|---|---|
| Distinct (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 2.0 MiB |
| private | 22342 |
|---|---|
| public | 4287 |
| self-emp | 3605 |
| unknown | 1808 |
| wo-pay | 13 |
Length
| Max length | 8 |
|---|---|
| Median length | 7 |
| Mean length | 6.978009981 |
| Min length | 5 |
Characters and Unicode
| Total characters | 223715 |
|---|---|
| Distinct characters | 20 |
| Distinct categories | 2 ? |
| Distinct scripts | 2 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | public |
|---|---|
| 2nd row | self-emp |
| 3rd row | private |
| 4th row | private |
| 5th row | private |
| Value | Count | Frequency (%) |
| private | 22342 | 69.7% |
| public | 4287 | 13.4% |
| self-emp | 3605 | 11.2% |
| unknown | 1808 | 5.6% |
| wo-pay | 13 | < 0.1% |
| never | 5 | < 0.1% |
| Value | Count | Frequency (%) |
| private | 22342 | |
| public | 4287 | 13.4% |
| self-emp | 3605 | 11.2% |
| unknown | 1808 | 5.6% |
| wo-pay | 13 | < 0.1% |
| never | 5 | < 0.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| p | 30247 | |
| e | 29562 | |
| i | 26629 | |
| a | 22355 | |
| r | 22347 | |
| v | 22347 | |
| t | 22342 | |
| l | 7892 | 3.5% |
| u | 6095 | 2.7% |
| n | 5429 | 2.4% |
| Other values (10) | 28470 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 220097 | |
| Dash Punctuation | 3618 | 1.6% |
Most frequent character per category
| Value | Count | Frequency (%) |
| p | 30247 | |
| e | 29562 | |
| i | 26629 | |
| a | 22355 | |
| r | 22347 | |
| v | 22347 | |
| t | 22342 | |
| l | 7892 | 3.6% |
| u | 6095 | 2.8% |
| n | 5429 | 2.5% |
| Other values (9) | 24852 |
| Value | Count | Frequency (%) |
| - | 3618 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 220097 | |
| Common | 3618 | 1.6% |
Most frequent character per script
| Value | Count | Frequency (%) |
| p | 30247 | |
| e | 29562 | |
| i | 26629 | |
| a | 22355 | |
| r | 22347 | |
| v | 22347 | |
| t | 22342 | |
| l | 7892 | 3.6% |
| u | 6095 | 2.8% |
| n | 5429 | 2.5% |
| Other values (9) | 24852 |
| Value | Count | Frequency (%) |
| - | 3618 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 223715 |
Most frequent character per block
| Value | Count | Frequency (%) |
| p | 30247 | |
| e | 29562 | |
| i | 26629 | |
| a | 22355 | |
| r | 22347 | |
| v | 22347 | |
| t | 22342 | |
| l | 7892 | 3.5% |
| u | 6095 | 2.7% |
| n | 5429 | 2.4% |
| Other values (10) | 28470 |
target
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.1% |
| Missing | 29033 |
| Missing (%) | 90.6% |
| Memory size | 1.3 MiB |
| 0.0 | 2787 |
|---|---|
| 1.0 | 240 |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 9081 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 ? |
| Distinct scripts | 1 ? |
| Distinct blocks | 1 ? |
Unique
| Unique | 0 ? |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 0.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
| Value | Count | Frequency (%) |
| 0.0 | 2787 | 8.7% |
| 1.0 | 240 | 0.7% |
| (Missing) | 29033 | 90.6% |
| Value | Count | Frequency (%) |
| 0.0 | 2787 | 92.1% |
| 1.0 | 240 | 7.9% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 5814 | |
| . | 3027 | |
| 1 | 240 | 2.6% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 6054 | |
| Other Punctuation | 3027 |
Most frequent character per category
| Value | Count | Frequency (%) |
| 0 | 5814 | |
| 1 | 240 | 4.0% |
| Value | Count | Frequency (%) |
| . | 3027 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 9081 |
Most frequent character per script
| Value | Count | Frequency (%) |
| 0 | 5814 | |
| . | 3027 | |
| 1 | 240 | 2.6% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 9081 |
Most frequent character per block
| Value | Count | Frequency (%) |
| 0 | 5814 | |
| . | 3027 | |
| 1 | 240 | 2.6% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear relationship the steepness of the line does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
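As a rough illustration of the calculation described above, the following sketch computes r in plain Python. The function name `pearson_r` and the sample data are assumptions made for this example and are not taken from the report; it is not the code the profiling tool itself runs.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: covariance of X and Y over the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/(n-1) normalisation appears in both numerator and denominator,
    # so it cancels and is omitted here.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical sample data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
r_original = pearson_r(x, y)
# Shifting and rescaling y (positive scale) leaves r unchanged.
r_rescaled = pearson_r(x, [3.0 * v + 5.0 for v in y])
```

Comparing `r_original` with `r_rescaled` shows the location and scale invariance mentioned in the text.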
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation and +1 indicating total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
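The rank-based calculation described above can be sketched as follows. The helper names `average_ranks`, `pearson_r` and `spearman_rho` are assumptions for this example, and ties are given their average rank, which is one common convention and may not match the profiling tool exactly.

```python
import math


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance over the product of standard deviations (normalisation cancels)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(ss_x * ss_y)


def spearman_rho(xs, ys):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


# Hypothetical data: y = x**2 is monotonic in x but not linear.
x = [1, 2, 3, 4, 5]
y = [v ** 2 for v in x]
rho = spearman_rho(x, y)
r = pearson_r(x, y)
```

On this monotonic but nonlinear data, `rho` reaches 1 while `r` stays below 1, which is the behaviour the paragraph describes.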
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation and +1 indicating total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
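The pair-counting definition above can be written out directly. This sketch implements exactly that formula, often called tau-a; the function name `kendall_tau_a` is an assumption for this example, and tie-corrected variants such as tau-b give different values when the data contain repeated values.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs, as described in the text."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        sign = (x1 - x2) * (y1 - y2)
        if sign > 0:
            concordant += 1  # both variables move in the same direction
        elif sign < 0:
            discordant += 1  # the variables move in opposite directions
        # sign == 0 means a tie in x or y; the pair counts as neither
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs


# Hypothetical data: two concordant pairs and one discordant pair.
tau = kendall_tau_a([1, 2, 3], [1, 3, 2])
```

The quadratic loop over all pairs keeps the sketch close to the definition; it is not meant to be efficient on a dataset of this size.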
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available for Phik.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013.
First rows
| | age | marital_status | education | occupation_level | education_num | familiarity_FB | view_FB | interested_insurance | created_account | has_married | education_order | job_title_top_10 | company_email_address | hours_per_week | capital_gain | capital_loss | native_country | demographic_characteristic | town_adj | paye_adj | annual_salary | salary_band_text_adj | total_months_with_employer | workclass_adj | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 39 | Never-married | Bachelors | 1 | 17 | 7 | 9 | 0 | No | 0 | 11 | Other | jones.com | 40 | 2174 | 0 | United Kingdom | 77516 | Edinburgh | Other | 18109.0 | £ yearly | 246 | public | 0 |
| 1 | 50 | Married-civ-spouse | Bachelors | 4 | 17 | 9 | 6 | 1 | No | 1 | 11 | Other | Other | 13 | 0 | 0 | United Kingdom | 83311 | Leeds | Other | 16945.0 | £ yearly | 337 | self-emp | 0 |
| 2 | 38 | Divorced | HS-grad | 12 | 12 | 5 | 4 | 1 | No | 0 | 9 | Other | Other | 40 | 0 | 0 | United Kingdom | 215646 | Edinburgh | Other | 37908.0 | £. pw | 173 | private | 0 |
| 3 | 53 | Married-civ-spouse | 11th | 1 | 9 | 9 | 2 | 0 | No | 1 | 7 | Other | Other | 40 | 0 | 0 | United Kingdom | 234721 | Edinburgh | Other | 19087.0 | £ yearly | 390 | private | 0 |
| 4 | 28 | Married-civ-spouse | Bachelors | 12 | 17 | 8 | 9 | 1 | No | 1 | 11 | Other | Other | 40 | 0 | 0 | Sweden | 338409 | Swindon | BR442000 | 32892.0 | £. per month | 42 | private | 0 |
| 5 | 37 | Married-civ-spouse | Masters | 7 | 18 | 7 | 5 | 0 | No | 1 | 12 | Other | Other | 40 | 0 | 0 | United Kingdom | 284582 | Other | Other | 24336.0 | £. pw | 47 | private | 0 |
| 6 | 49 | Married-spouse-absent | 9th | 1 | 6 | 1 | 2 | 0 | No | 1 | 5 | Other | Other | 16 | 0 | 0 | Jamaica | 160187 | Edinburgh | Other | 16392.0 | £. per month | 30 | private | 0 |
| 7 | 52 | Married-civ-spouse | HS-grad | 13 | 12 | 9 | 7 | 0 | No | 1 | 9 | Other | Other | 45 | 0 | 0 | United Kingdom | 209642 | Other | Other | 37407.0 | £ yearly | 29 | self-emp | 0 |
| 8 | 31 | Never-married | Masters | 12 | 18 | 6 | 3 | 1 | Yes | 0 | 12 | Other | Other | 50 | 14084 | 0 | United Kingdom | 45781 | Swindon | NW384000 | 39744.0 | £. per month | 52 | private | 1 |
| 9 | 42 | Married-civ-spouse | Bachelors | 4 | 17 | 5 | 3 | 1 | Yes | 1 | 11 | Other | Other | 40 | 5178 | 0 | United Kingdom | 159449 | Other | Other | 16785.0 | £ yearly | 1 | private | 1 |
Last rows
| | age | marital_status | education | occupation_level | education_num | familiarity_FB | view_FB | interested_insurance | created_account | has_married | education_order | job_title_top_10 | company_email_address | hours_per_week | capital_gain | capital_loss | native_country | demographic_characteristic | town_adj | paye_adj | annual_salary | salary_band_text_adj | total_months_with_employer | workclass_adj | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 32050 | 49 | Married-civ-spouse | HS-grad | 8 | 12 | 8 | 1 | 1 | unknown | 1 | 9 | Conservator, museum/gallery | Other | 40 | 5013 | 0 | United Kingdom | 66385 | Edinburgh | Other | 17374400.0 | foreign_ccy | 120 | private | None |
| 32051 | 22 | Never-married | Bachelors | 7 | 17 | 3 | 6 | 0 | unknown | 0 | 11 | Other | Other | 30 | 1055 | 0 | United Kingdom | 205940 | Edinburgh | Other | 17700.0 | £. per month | 17 | private | None |
| 32052 | 51 | Married-civ-spouse | Some-college | 6 | 13 | 10 | 7 | 1 | unknown | 1 | 10 | Event organiser | Other | 50 | 0 | 0 | United Kingdom | 260938 | Swindon | NW384000 | 22404.0 | £. per month | 140 | self-emp | None |
| 32053 | 33 | Married-civ-spouse | HS-grad | 10 | 12 | 6 | 8 | 0 | unknown | 1 | 9 | Other | Other | 40 | 3411 | 0 | United Kingdom | 60567 | Edinburgh | Other | 26627.0 | £ yearly | 103 | private | None |
| 32054 | 23 | Never-married | Bachelors | 8 | 17 | 9 | 5 | 0 | unknown | 0 | 11 | Other | Other | 50 | 0 | 0 | United Kingdom | 335067 | Edinburgh | Other | 25869.5 | £ - range | 25 | private | None |
| 32055 | 34 | Never-married | HS-grad | 4 | 12 | 7 | 4 | 1 | unknown | 0 | 9 | Other | Other | 30 | 0 | 0 | United Kingdom | 331126 | Edinburgh | Other | 19488.0 | £. per month | 38 | private | None |
| 32056 | 53 | Divorced | 12th | 3 | 10 | 8 | 7 | 1 | unknown | 0 | 8 | Other | Other | 40 | 0 | 0 | United Kingdom | 156612 | Edinburgh | Other | 15116.0 | £ yearly | 167 | private | None |
| 32057 | 44 | Married-civ-spouse | Bachelors | 6 | 17 | 3 | 4 | 1 | unknown | 1 | 11 | Environmental health practitioner | Other | 45 | 0 | 0 | United Kingdom | 188436 | Swindon | NW384000 | 23733.0 | £ yearly | 163 | private | None |
| 32058 | 60 | Widowed | Some-college | 6 | 13 | 6 | 2 | 1 | unknown | 1 | 10 | Event organiser | Other | 40 | 0 | 0 | United Kingdom | 227468 | Edinburgh | Other | 18617.0 | £ yearly | 107 | private | None |
| 32059 | 55 | Married-civ-spouse | Some-college | 8 | 13 | 7 | 5 | 1 | unknown | 1 | 10 | Other | Other | 38 | 0 | 0 | United Kingdom | 183580 | Swindon | Other | 22185.0 | £ yearly | 247 | private | None |